ABSTRACT


This report describes the use of a computer program that converts
a grammar's production rules from extended Backus-Naur-Form (EBNF) to an
equivalent set of production rules in ordinary Backus-Naur-Form (BNF) suitable
for use with the Yet Another Compiler-Compiler (YACC) system. This permits
the language designer to use the far less bulky EBNF format and then to
convert automatically to BNF for use with YACC. A PDP-11 computer system
running the UNIX operating system is assumed.
I Introduction

This report describes the use of a computer program that
converts grammar production rules in Extended Backus-Naur-Form
(EBNF) into ordinary Backus-Naur-Form (BNF). EBNF is very
convenient for a human description of a grammar but is not in a
format acceptable to the Yet Another Compiler-Compiler (YACC) system
[John75]. YACC requires the far bulkier format of ordinary BNF,
which is inconvenient for human use. The program whose use is
described here is itself a translator written for the YACC
system; the BNF it produces can be used as input to YACC to
yield a parse table and other processing for the original EBNF
grammar.

The EBNF to BNF converter program is stored in the Naval
Postgraduate School Computer Sciences Laboratory under the name
"ebnftobnf". It is intended to work on a PDP-11 under the UNIX
operating system. This technical report may be accessed on the
UNIX system by typing "man ebnftobnf".

II The EBNF Syntax

The EBNF syntax acceptable as input to the converter is
presented in this section. An example grammar is also presented.

EBNF makes use of grammar production rules consisting of
terminals, nonterminals, and a replacement operator. In the
discussion that follows we assume that terminal tokens are uppercase
letters or strings of letters, or are enclosed in single quotes.
The latter form is usually reserved for trivial terminals such as
parentheses, semicolons, etc. Nonterminals are lowercase letters
or strings of letters. The head symbol is the nonterminal "z", as
is the convention in some textbooks. The replacement operator is
the left arrow, written as <--.

Two sets of metasymbols in EBNF must be removed from the
grammar (by modifying the production rules) to produce an
equivalent BNF grammar. These are the square brackets [...], meaning
"zero or one", and the curly brackets {...}, meaning "zero or
more". As is usual in production rules, the vertical bar | means
"or".

Consider the following example in EBNF:

    z <-- [A] C

In BNF two production rules are needed to express an equivalent
grammar:

    z <-- C | A C

or

    z <-- a' C
    a' <-- null | A

In either case the grammar accepts only the strings "C" and "AC".

Consider the use of the curly brackets to mean "zero or
more":

    z <-- A {A}

This produces all the strings of the form A, AA, AAA, AAAA, and
so on; the BNF equivalent must be:

    z <-- z A | A

or

    z <-- z A
    z <-- A

The advantage of using EBNF to describe a grammar is obvious
from these examples; it is unfortunate that YACC will not accept
a grammar in this form. In the next section the exact format of
the EBNF productions required for processing by YACC is
presented.
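
The two rewrites illustrated in this section can be sketched in C. This is
only an illustration under stated assumptions, not the converter itself: the
function names expand_optional and expand_repetition are hypothetical, the
bracketed body is assumed to be a single symbol, and the "opt." and "fst."
names are borrowed from the style of nonterminal the converter is described
as generating. An empty alternative before the bar stands for the null
production.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch only: build the BNF helper rule for an optional group [body].
 * Assumes body is a single symbol.  Returns 1 on success, 0 if the
 * output buffer is too small. */
int expand_optional(const char *body, char *out, size_t outsz)
{
    int n;

    assert(body != NULL && out != NULL);
    /* opt.X. : | X ;   (null alternative, or one occurrence of X) */
    n = snprintf(out, outsz, "opt.%s. : | %s ;", body, body);
    return n >= 0 && (size_t)n < outsz;
}

/* Sketch only: build the BNF helper rule for a repetition group {body}.
 * The rule is left recursive, matching zero or more occurrences. */
int expand_repetition(const char *body, char *out, size_t outsz)
{
    int n;

    assert(body != NULL && out != NULL);
    /* fst.X. : | fst.X. X ;   (null alternative, or a shorter list plus X) */
    n = snprintf(out, outsz, "fst.%s. : | fst.%s. %s ;", body, body, body);
    return n >= 0 && (size_t)n < outsz;
}
```

With these helpers, z <-- [A] C would be written as z : opt.A. C ; together
with the rule produced by expand_optional("A", ...).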

III Use of the Converter Program

In this section a simple EBNF grammar is modified to the
format acceptable to YACC, and the grammar is converted to BNF by the
translator program.

As an example grammar consider the following production
rules:

    z <-- {b} ;
    b <-- [C] [A] D

Here z and b are nonterminals and A, C, D, and ; are terminals.
How might these productions be modified to a format acceptable to
the translator program?

Several symbols must be replaced in the EBNF used above to
make the productions acceptable to YACC. First, the replacement
operator must be a colon (:) instead of a left arrow (<--).
Secondly, all trivial terminals (i.e., parentheses, semicolons,
etc.) must be enclosed in single quotes ('). Thirdly, all other
terminals must be explicitly indicated to YACC. Finally, the
head symbol production rule must be the first (top) rule.

The above example production rules are manually converted to
yield the following:

    %token A C D

    %%

    z : {b} ';' ;
    b : [C] [A] D ;

As many %token statements as needed can be used.

Now consider the execution of the EBNF to BNF translator.
Since it is itself a YACC input program, it must first be processed
by YACC:

    yacc ebnftobnf

This produces a file in your file space named "y.tab.c". The
next step is to compile the C program in file "y.tab.c" by
typing:

    cc y.tab.c -ly

This produces a file named "a.out" that can actually translate
EBNF to BNF by the following command:

    a.out <ebnffile >bnffile

where "ebnffile" is the EBNF input file requiring translation;
the ordinary BNF equivalent will result in file "bnffile". Choose
whatever names you like for these files. The appendix shows the
example presented above before and after translation.

IV Using the BNF Equivalent

In this section the use of the BNF equivalent as input to
another YACC process is described.

The whole purpose of the EBNF to BNF conversion process is
to produce a set of production rules acceptable to YACC, and thus
be able to build a "compiler" that can process a "program" in the
grammar to produce either a "yes" or "no" answer as to the
program's syntactic correctness or to compile it to some other
target language. To accomplish this the equivalent BNF grammar must
be embedded among other statements that indicate the terminal
tokens and a C program (possibly making use of LEX [Lesk]).

To do this you must reproduce the list of terminals used
in the conversion process (%token ... %%) and prepend it to
the "bnffile". One VERY IMPORTANT production rule modification
must be made prior to resubmitting the "bnffile". The
conversion process typically changes the order of the production
rules due to the inclusion of new rules with new nonterminals. Be
sure to move the original head symbol production rule back to
the very top of the list of rules; YACC requires this if a
correct parse table is to result. It may have been moved down the
list if it had square or curly brackets in its right hand side.
Finally, append any C program for processing the grammar into a
target language to the list of production rules, separated from them by
a %% delimiter line. See the YACC manual for details.
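
The reordering step described above is normally done by hand in the editor,
but it can be sketched in C. This is a minimal illustration under
assumptions that are not part of the report: each rule is held as one string
beginning with its nonterminal and a colon, and the function names defines
and move_head_first are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the rule text begins with the nonterminal `head`
 * followed by optional blanks and a colon. */
static int defines(const char *rule, const char *head)
{
    size_t n = strlen(head);

    if (strncmp(rule, head, n) != 0)
        return 0;
    rule += n;
    while (*rule == ' ' || *rule == '\t')
        rule++;
    return *rule == ':';
}

/* Move the rule defining `head` to the top of the list, keeping the
 * relative order of all other rules.  Returns the rule's previous
 * index, or -1 if no rule defines `head`. */
int move_head_first(const char *rules[], size_t count, const char *head)
{
    size_t i;

    assert(rules != NULL && head != NULL);
    for (i = 0; i < count; i++) {
        if (defines(rules[i], head)) {
            const char *h = rules[i];

            /* Shift rules 0..i-1 down one slot, then put the head on top. */
            memmove(&rules[1], &rules[0], i * sizeof rules[0]);
            rules[0] = h;
            return (int)i;
        }
    }
    return -1;
}
```

Calling move_head_first(rules, count, "z") on the converter's output would
place the rule for z first, which is the arrangement YACC expects.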

V Conclusion

This report describes how to convert an EBNF grammar to BNF
suitable for YACC. While the program has been tested and found to
work satisfactorily, the usual disclaimer as to correctness must
be made. The conversion process yields new production rules with
new nonterminals. These new nonterminals are formed by
concatenating the original nonterminals with prefixes such as "fst."
and "opt.", and the results for a complicated grammar can get
quite long. Use the editor to shorten them if desired, but
preserve the uniqueness of each nonterminal. Some nonterminals
may contain sequences such as ".."; these are acceptable to YACC
and so may be left unchanged.
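
When generated names are shortened by hand, a quick check that they are
still distinct can save a confusing YACC run. The following C sketch shows
one way to do it; the function name first_duplicate and the idea of
collecting the names into an array are assumptions made for illustration and
are not part of the converter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first name that repeats an earlier name,
 * or -1 if every name in the list is distinct. */
int first_duplicate(const char *names[], size_t count)
{
    size_t i, j;

    assert(names != NULL);
    for (i = 1; i < count; i++)
        for (j = 0; j < i; j++)
            if (strcmp(names[i], names[j]) == 0)
                return (int)i;
    return -1;
}
```

A result of -1 means the shortened nonterminals are still unique; any other
value identifies a name that must be changed before the file is given to
YACC.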
%token A C D
%%

z : {b} ';' ;

b : [C] [A] D ;

**** The following is an example output file (bnffile).  ****
**** Note that the first two rules must be interchanged  ****
**** if it is to be used as part of a YACC input via     ****
**** the a.out process.                                  ****
**** Note the null production: fst.b.: null | fst.b. b ; ****


fst.b.:
    | fst.b. b ;
z:
    fst.b. b ';' ;

opt.C.:
    | C ;
opt.A.:
    | A ;

b:
    opt.C. opt.A. D ;
**** Following is a listing of the "ebnftobnf" program ****


%token SYMBOL LITERAL
%{
#define NULL 0
struct node
{   char symbol[30];
    struct node *first;
    struct node *next;
};

char symbol[30];
struct node *pn;

%}
%%


grammar:
    rule_list;

rule_list:
    rule
    | rule_list rule;

rule:
    nonterm ':' alternative_list ';'
    = { printf ("%s%c\n\t", $1->symbol, ':');
        for (pn = $3; pn != NULL; pn = pn->next)
        {   pitems (pn->first);
            if (pn->next == NULL) printf (" ");
            else printf ("\n\t| ");
        }
        printf (";\n");
    }

nonterm:
    SYMBOL
    = { $$ = ncreate (symbol, NULL, NULL);
    }

alternative_list:
    alternative
    = { $$ = ncreate ("a", $1, NULL);
    }
    | alternative_list '|' alternative
    = { last ($1)->next = ncreate ("a", $3, NULL);
    }

alternative:
    = { $$ = ncreate (" ", NULL, NULL);
    }
    | element_list;

element_list:
    element
    | element_list element
    = { last ($1)->next = $2;
    }

element:
    SYMBOL
    = { $$ = ncreate (symbol, NULL, NULL);
    }
    | LITERAL
    = { $$ = ncreate (symbol, NULL, NULL);
    }
    | '[' element_list ']'
    = { $$ = ncreate ("o", $2, NULL);
        if (!lookup ($$))
        {   printf ("\n");
            pitem ($$);
            printf ("%c\n\t| ", ':');
            pitems ($2);
            printf (";");
        }
    }
    | '{' element_list '}'
    = { $$ = ncreate ("l", $2, NULL);
        if (!lookup ($$))
        {   printf ("\n");
            pitem ($$);
            printf ("%c\n\t| ", ':');
            pitem ($$);
            printf (" ");
            pitems ($2);
            printf (";");
        }
    }

%%

#define LETTER 'a'
#define DIGIT '0'

yylex ()
{
    int i, t, getch();
    char c;

    while ((c = getch()) == ' ' || c == '\n' || c == '\t');
    if (type (c) == LETTER)
    {   i = 0;
        symbol[i++] = c;
        while ((t = type (c = symbol[i++] = getch())) == LETTER
            || t == DIGIT || c == '_' || c == '.');
        ungetch (c);
        symbol[--i] = '\0';
        return (SYMBOL);
    }
    else if (c == '\'')
    {   i = 0;
        symbol[i++] = c;
        while ((c = symbol[i++] = getch()) != '\'');
        symbol[i] = '\0';
        return (LITERAL);
    }
    else return (c);
}

type (c)
char c;
{
    if (c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z') return (LETTER);
    if (c >= '0' && c <= '9') return (DIGIT);
    return (c);
}

ncreate (string, first, next)
char *string;
struct node *first, *next;
{
    struct node *p;

    p = alloc (40);
    strcpy (p->symbol, string);
    p->first = first;
    p->next = next;
    return (p);
}

last (np)
struct node *np;
{
    struct node *p;

    for (p = np; p->next != NULL; p = p->next);
    return (p);
}

strcpy (s, t)
char *s, *t;
{
    while (*s++ = *t++);
}

pitems (np)
struct node *np;
{
    struct node *p;

    for (p = np; p != NULL; p = p->next)
    {   pitem (p);
        printf (" ");
    }
}

pitem (np)
struct node *np;
{
    if (np->first == NULL) printf ("%s", np->symbol);
    else
    {   if (strcmp (np->symbol, "o") == 0) printf ("opt");
        else printf ("fst");
        plist (np->first);
    }
}

plist (np)
struct node *np;
{
    while (np != NULL)
    {   if (np->first == NULL)
            if (*(np->symbol) == '\'') printf ("._");
            else printf (".%s", np->symbol);
        else if (strcmp (np->symbol, "o") == 0)
        {   printf ("..opt");
            plist (np->first);
        }
        else
        {   printf ("..fst");
            plist (np->first);
        }
        np = np->next;
    }
    printf (".");
}

strcmp (s, t)
char s[], t[];
{
    int i;

    i = 0;
    while (s[i] == t[i])
        if (s[i++] == '\0') return (0);
    return (s[i] - t[i]);
}

char buf[1];
int bufp 0;

getch ()
{
    return ((bufp == 0) ? getchar() : buf[--bufp]);
}

ungetch (c)
int c;
{
    buf[bufp++] = c;
}

#define TRUE 1
#define FALSE 0

struct node *newnonterm[100];
int nonew 0;

lookup (np)
struct node *np;
{
    int i;

    for (i = 0; i < nonew; i++)
        if (equal (np, newnonterm[i]))
            return (TRUE);
    newnonterm[nonew++] = np;
    return (FALSE);
}

equal (x, y)
struct node *x, *y;
{
    if (strcmp (x->symbol, y->symbol) != 0) return (FALSE);
    else return (eqlist (x->first, y->first));
}

eqlist (x, y)
struct node *x, *y;
{
    while (x != NULL && y != NULL)
    {   if (!eqtype (x, y)) return (FALSE);
        if (strcmp (x->symbol, y->symbol) != 0) return (FALSE);
        if (x->first != NULL)
            if (!eqlist (x->first, y->first)) return (FALSE);
        x = x->next;
        y = y->next;
    }
    if (x != y) return (FALSE);
    else return (TRUE);
}

eqtype (x, y)
struct node *x, *y;
{
    if (x->first == NULL) return (y->first == NULL);
    if (y->first == NULL) return (FALSE);
    if (*(x->symbol) == 'o') return (*(y->symbol) == 'o');
    return (*(y->symbol) == 'l');
}